DATA CONVERSION METHOD FOR A MULTI-BEAM LASER WRITER FOR
VERY COMPLEX MICROLITHOGRAPHIC PATTERNS



Field of the invention 



The invention relates to microlithography, in particular
to the writing of photomasks for computer displays,
microelectronic devices, and precision photoetching. It is
also applicable to wafers, optical devices and a variety of
electronic interconnection structures such as multichip
modules. Other applications are possible, such as printing
and graphics, as well as laser projection displays.

Background of the invention

The application discloses a method for data conversion at
extremely high throughput in a multi-beam laser plotter.
The need for such high capacity comes from two sources:
the ever-increasing number of features on photomasks, and
increasingly sophisticated designs. For computer displays,
consumer TV screens and microelectronic products alike,
there is a rapid development towards simultaneously larger
sizes and smaller elemental cells. The development is most
dramatic with semiconductor memories, where a photomask
could contain a billion elemental geometries or more.
Furthermore, the elemental geometries need not be
rectangular, but could be of any shape.
The input data file may be in a compacted hierarchical
format, but during processing the data volume increases
immensely (up to 1000-10 000 Gb per mask) and it is
impossible to process the data beforehand and store the
data until the time of writing. The datapath must
therefore have enough processing capacity to convert the
data in real time.

Another issue is the necessity of a small address grid.
The writing system for semiconductor masks must be capable
of writing features specified in units of 10 nm
(nanometers) or less. It has been disclosed in European
Patent EP 0 467 076 by the same inventor that a
combination of time delays and analog power modulation can
be used to achieve an arbitrarily small address grid. The
same patent also discloses the use of several beams and
parallel data paths to increase the throughput of the
writing system.

For a writer with two laser beams two parallel data 
paths may be feasible, but current multibeam writers may 
use up to 32 beams and simple multiplication of a single- 
beam datapath would be practically impossible. 

There is also a strong desire to have unequal num- 
bers of processors and beams, in particular a much larger 
number of processors than beams. A second need is to make 
the system easily scaleable, so that writers for differ- 
ent applications with different requirements on capacity 
can be configured from standard modules and running iden- 
tical software. 

In United States Patent US 5 533 170 a high-through- 
put multibeam data path based on parallel rasterizers is 
disclosed. Each rasterizer, "geometry engine", converts a 
frame of the pattern to a pixel map where each pixel has 
a greyscale value from 0 to 16. The bitmaps are distrib- 
uted to beam boards via a bus system and loaded into a 
buffer RAM area in each bus board.

The method in US 5 533 170 requires very high proc- 
essing power. In particular every pixel has to be filled 
with its proper value and transmitted to the beam boards 
for writing. This is done by signal processors and custom 
ASICs. The writing system has a burst pixel rate of 1600
million pixels per second, and extremely high demands are 
placed on the internal data paths. Therefore a system 
with parallel buses is used and the result is a complex, 
costly and inflexible system. 
The present invention devises a method for data conversion
that can be used on configurations from one beam/one
processor to tens of beams/hundreds of processors.

Brief summary of the invention 
In the present invention the data conversion is divided
into two steps: first cutting the geometries into scan
lines and simplifying them, and then finishing the
conversion of the scan lines at the point of demand, i.e.
in a beam processor in the driving electronics for each
beam. The idea is to perform as much of the conversion as
possible at the latest possible point, i.e. at the beams.
What is needed at an earlier stage is to separate the data
for different beams and distribute them, and to simplify
the data enough to make sure that the beam processors can
always handle the data flow.

There are benefits with the invention in three areas:

- there is nowhere in the system a pixel map that has to
be filled, therefore a lot of processing power is saved
- keeping the information to the beam processors in
geometrical form instead of as a pixel map gives a smaller
data volume, making the implementation simpler and more
flexible. Practical tests indicate savings of 4 - 20
times depending on the pattern.
- the manipulation of the geometrical data without filling
operations is well suited for algorithmic programs running
on a general-purpose processor, while the final processing
in the beam boards is better served with custom logic.
Using general-purpose processors gives great flexibility.
It is possible to increase the performance simply by
moving to faster processors as they become available, and
it is easy to modify or refine the algorithms to follow
the needs of the applications. Custom algorithms for
specific applications or new input formats are easily
implemented.

Brief description of the drawings

Figure 1a shows how a round shape 101 combined with a
triangular shape 102 is represented by a pixel map 103
with analog intensities (shown as varying shading). The
beam 104 is scanning parallel lines 105. The size of the
writing light spot is larger than a pixel, therefore the
result on the plate will be smoothed to a round figure.

Figure 1b shows the same shapes as in Figure 1a, but where
the geometrical shapes are cut into segments 106 belonging
to different scan lines.

Figure 1c shows the same shapes as in Figure 1b, but where
the segments are replaced by a simplified new segment 107,
with only length and width. As in Figure 1a the size of
the spot will make the written figure smooth.

Figure 1d shows the segments in Figure 1c converted to
analog values by the beam processor.

Figure 2 shows three beams 201, 202, 203 forming
interlaced scan lines with the spacing 206. The figure
shows that the beams scan three lines and then retrace
while the stage is advanced a distance 207 equal to three
times the scan spacing. There are several possible
spacings 205 between the beams, here two times the scan
spacing 205 is shown.

Figure 3 shows a preferred embodiment of the invention
with two beams and two segmentizers.

Figure 4 shows a preferred embodiment with three beams and
four segmentizers.

Figure 5 shows how data is buffered to allow all
components to run continuously at full capacity in another
preferred embodiment with four segmentizers and three
beams.

Function of the invention 

Figure 3 shows an embodiment with two processors and two
beams writing on a workpiece 301 using a demagnification
and focusing lens 305. The scanning and advancement
between the scans, not shown in this figure, can be done
by the stage or the beams or by a combination of the two.
The pattern, shown as a figure 306 in a square window 307,
is described in the input data read from tape 308 or from
a network 309. The input can be stored on local mass
storage 310, e.g. on a local hard disk, by the host
computer 311. The host computer sends the input data to
the segmentizers 312, 313 after having performed any
necessary format conversions, scalings, expansion of
hierarchical structures, etc. It may use mass storage 310
for intermediate storage at any time. Furthermore it cuts
the data into fields that are suitable to the length of
the scan lines and to the size of the data buffers in the
data path. Depending on the complexity of the data a field
can be chosen to be a full writing swath or part of a
swath.

The host computer sends the data for each field to 
one of the segmentizers 312, 313, typically in the order 
they need to be written and to the first available seg- 
mentizer. The host computer maintains a table of where 
the data for each field is and its status. 
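
The dispatching described above can be sketched as follows. This is a minimal illustration, not the implementation of the invention: the function name `dispatch_fields` and the assumption that the processing time of each field is known in advance are hypothetical and serve only to show the "first available segmentizer" rule and the table kept by the host computer.

```python
import heapq

def dispatch_fields(field_costs, n_segmentizers):
    """Assign fields, in writing order, to the first available segmentizer.

    field_costs: assumed processing time of each field, in writing order
                 (hypothetical input; a real host would react to completions).
    Returns a table: field index -> (segmentizer index, finish time).
    """
    # Heap of (time at which the segmentizer becomes free, segmentizer index).
    free_at = [(0.0, s) for s in range(n_segmentizers)]
    heapq.heapify(free_at)
    table = {}
    for field, cost in enumerate(field_costs):
        t_free, seg = heapq.heappop(free_at)   # first available segmentizer
        finish = t_free + cost
        table[field] = (seg, finish)           # host's record for this field
        heapq.heappush(free_at, (finish, seg))
    return table

table = dispatch_fields([3.0, 1.0, 1.0, 2.0], n_segmentizers=2)
```

With two segmentizers, the slow first field occupies one of them while the other works through the three following fields.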

The segmentizers cut the data to each scan line and form a
list of geometrical elements for each scan line and a list
of scan lines 316, 317. Although the function of the
invention does not depend on it, the segmentizer may
simplify the geometries in each scan line, remove any
overlapping geometries, form segments that are rectangles
with length and width, and sort both the lists of segments
and the list of scan lines in order of use by the writing
hardware.
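
A minimal sketch of such a segmentizer step is given below. It makes simplifying assumptions that are not part of the specification: the input geometries are axis-aligned rectangles whose edges across the scan direction fall on scan-line boundaries, so every segment has full width and only start and length are kept. The name `segmentize` is hypothetical.

```python
from collections import defaultdict

def segmentize(rects, pitch):
    """Cut axis-aligned rectangles into merged, sorted per-scan-line segments.

    rects: iterable of (x0, y0, x1, y1); y0 and y1 are assumed to be
           multiples of `pitch` (the scan-line spacing).
    Returns a list of (scan_line_index, [(start, length), ...]) sorted by line.
    """
    lines = defaultdict(list)
    for x0, y0, x1, y1 in rects:
        first = round(y0 / pitch)
        last = round(y1 / pitch)              # exclusive upper scan line
        for line in range(first, last):
            lines[line].append((x0, x1))

    result = []
    for line in sorted(lines):                # scan lines in order of use
        merged = []
        for start, end in sorted(lines[line]):
            if merged and start <= merged[-1][1]:
                # Overlapping or touching: extend the previous interval.
                merged[-1][1] = max(merged[-1][1], end)
            else:
                merged.append([start, end])
        result.append((line, [(s, e - s) for s, e in merged]))
    return result

scan_list = segmentize([(0, 0, 4, 2), (3, 1, 6, 3)], pitch=1)
```

Here the two rectangles overlap on scan line 1, where they are merged into a single segment of length 6.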

The lists of scan lines are sent to the interlace
resolvers 314, 315 where the scan lines are separated
depending on which beam they will be written by. New
interlaced lists for each beam are assembled. In Figure 3
the list 317 is split into the interlace lists 318 and 319
that are sent to beam processor units, e.g. beam processor
boards 320, 321, each with a beam processor 322 and a
modulator 323. In the beam processor boards the simplified
geometry in the scan lists is resolved and converted to
amplitude and time modulation of the laser beams. Since
the beams are scanning the workpiece in parallel the
interlaced patterns 324, 325 are reassembled in the
exposed pattern.
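
The separation of scan lines by beam can be illustrated with the interlace geometry of Figure 2 (three beams, a beam spacing of two scan spacings, a stage advance of three scan spacings per pass). The sketch below is an assumption-laden illustration: the function names are hypothetical, the stage advance is taken to equal the number of beams, and the beam spacing is assumed to share no common factor with the number of beams so that each line is written exactly once.

```python
def beam_for_line(line, n_beams, beam_spacing):
    """Return (beam, pass) writing scan line `line`, or None if no pass covers it.

    Assumes beam b on pass p writes line p * n_beams + b * beam_spacing,
    i.e. the stage advances n_beams scan spacings between passes.
    """
    for beam in range(n_beams):
        offset = line - beam * beam_spacing
        if offset >= 0 and offset % n_beams == 0:
            return beam, offset // n_beams
    return None

def resolve_interlace(scan_lines, n_beams, beam_spacing):
    """Split a list of (line_index, segments) into one list per beam."""
    per_beam = [[] for _ in range(n_beams)]
    for line, segments in scan_lines:
        hit = beam_for_line(line, n_beams, beam_spacing)
        if hit is not None:
            per_beam[hit[0]].append((line, segments))
    return per_beam

# Scan lines 2..10 with empty segment lists, Figure 2 geometry.
lists = resolve_interlace([(i, []) for i in range(2, 11)],
                          n_beams=3, beam_spacing=2)
```

Under these assumptions line 1 is not reached by any pass, which is the usual start-up gap of an interlaced scan; from line 2 onward every line is assigned to exactly one beam.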

Since only one field is written at a time only one
interlace resolver can send data to the beam processors at
a time, as is shown by the heavy lines from 315 to 320,
321, unless the transfers are buffered so that the
processing in the beam processors is decoupled from the
data input.

For a simple case with a small number of beams the 
distribution can be done by a multiplexor, i.e. a logic 
circuit that accepts a single input data stream from the 
segmentizer/segmentizers and directs data items to 
different outputs according to either their position in 
their stream or a tag in the data item. 
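
Both routing rules can be sketched in a few lines. The sketch is illustrative only: the name `demultiplex` is hypothetical, and "position in the stream" is interpreted here as round-robin distribution, which is an assumption.

```python
def demultiplex(items, n_outputs, by_tag=False):
    """Route items from one input stream to n_outputs output lists.

    by_tag=False: route by position in the stream (round-robin, assumed).
    by_tag=True:  each item is (tag, payload) and the tag selects the output.
    """
    outputs = [[] for _ in range(n_outputs)]
    for position, item in enumerate(items):
        if by_tag:
            tag, payload = item
            outputs[tag].append(payload)
        else:
            outputs[position % n_outputs].append(item)
    return outputs

by_position = demultiplex(["a", "b", "c", "d"], 2)
by_tag = demultiplex([(1, "x"), (0, "y"), (1, "z")], 2, by_tag=True)
```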

Figure 3 shows the method in schematic form; in a
practical implementation details may vary, e.g. the two
modulators can be a single physical device with two
channels, each segmentizer can use one or several
processors, etc.

Preferred embodiments 

A preferred embodiment of the invention is in a three-beam
laser writer for semiconductor reticles, as is shown in
Figure 4. The writer has a distance between the scan lines
of 0.25 µm and a shortest segment length of 0.25 µm. The
maximum conversion burst rate in the beam processors is 60
million segments per second and the system is writing
approximately 60 % of the total time. Accordingly the
system writes 3 * 0.25 µm * 0.25 µm * 60 % * 60 million
= 6.75 sq.mm/s.
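
The arithmetic above can be checked with a short calculation. The function name and parameter names are illustrative; the numbers are those stated for the preferred embodiment.

```python
def writing_rate_mm2_per_s(n_beams, line_pitch_um, min_segment_um,
                           segments_per_s, duty):
    """Area written per second when every segment has minimum length."""
    # Area of one minimum segment, converted from square µm to square mm.
    area_per_segment_mm2 = (line_pitch_um * 1e-3) * (min_segment_um * 1e-3)
    return n_beams * area_per_segment_mm2 * segments_per_s * duty

rate = writing_rate_mm2_per_s(n_beams=3, line_pitch_um=0.25,
                              min_segment_um=0.25,
                              segments_per_s=60e6, duty=0.60)
# rate is approximately 6.75 (sq.mm/s)
```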

The data distribution network must be dimensioned for the
worst possible case, i.e. the entire area filled with
segments of minimum length, or else it is possible to
supply an input data file that causes the system to
malfunction due to data overload. Each beam processor has
a maximum burst rate of 60 million segments per second and
each segment is described by two data bytes. The three
beam processors therefore have a maximum data consumption
of 360 Mb/s, corresponding to 180-240 Mb/s maximum
sustained average rate.

The links between the interlace resolvers and the beam
processors are implemented as a cross-switch network of
parallel links. Each link has a transfer rate of 180 Mb/s
and the shown network can at any time support three
simultaneous transfers. The throughput of the links
between the segmentizers and the beam processors is 3 x
180 Mb/s = 540 Mb/s burst rate, which is more than
adequate for the worst possible pattern including
overhead. Alternatively a simpler network can be used
supporting two or only one transfer.
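
The dimensioning can be reproduced as a short calculation. Here "Mb" is read as megabytes, which is what two bytes per segment at 60 million segments per second implies; the variable names are illustrative.

```python
SEGMENTS_PER_S = 60e6     # burst rate per beam processor
BYTES_PER_SEGMENT = 2
N_BEAMS = 3
LINK_RATE_MB_S = 180      # transfer rate of one link
N_LINKS = 3               # simultaneous transfers supported by the network

# Worst-case burst consumption of all beam processors, in Mb/s.
burst_demand = N_BEAMS * SEGMENTS_PER_S * BYTES_PER_SEGMENT / 1e6
# Aggregate burst capacity of the cross-switch network, in Mb/s.
link_capacity = N_LINKS * LINK_RATE_MB_S
# Ratio of capacity to worst-case demand.
headroom = link_capacity / burst_demand
```

The network capacity of 540 Mb/s exceeds the worst-case burst demand of 360 Mb/s by a factor of 1.5, which leaves room for overhead.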

Figure 5 shows how generous buffers allow all components
to work independently of all others. The heavy lines show
current data transfers. The interlace resolvers (IR1-IR4)
have two output buffers, one for storing new lists being
worked on and one for storing the previous list waiting
for transfer to the beam processors. Since the
segmentizers are typically slower than the interlace
resolvers, the buffer memory between S and IR need not
store any data; it need only be large enough to allow S
and IR to work in an asynchronous mode.

The beam processor units have FIFO buffers with room 
for several fields. Field n (Fn) is being written and is 
read from all FIFOs simultaneously, Fn+1 is transferred 
from IR1, while IR1 is working on Fn+5. IR2 and IR3 are 
one and two fields ahead of IR1, respectively, and the 
FIFOs of BP2 and BP3 are storing enough data to make the 
bottom of all FIFOs synchronized. 

S4 and IR4 have just finished Fn+4 and IR4 is transferring
the output from the work buffer to the transfer buffer. At
the same time the host computer HC is loading input data
for a new field to S4. In actual operation the scheduling
and transfer of data is more irregular than Figure 5 leads
one to believe, since the fields take different amounts of
time to process and the scheduling is based on demand and
availability. The buffer memories in Figure 5 need not be
physically separate but may be different areas in the same
physical memory, and they may be reassigned dynamically.
The processors P1 to P8 may likewise be 8 physical
processors, but they may also be another number and they
may be dynamically reassigned between different tasks.

Figure 5 assumes that data needs to be loaded sequentially
to the beam processor buffers. Using random-access writing
instead of FIFOs would allow smaller buffer areas, but at
the expense of more overhead and more complex management
by the host computer. In the preferred embodiment FIFOs
are used.

A real pattern will have a data requirement at least 4
times smaller than the maximum data rate, or 45 - 67 Mb/s.
A typical writing field is part of a swath 200 µm wide and
10 mm long, needing an absolute maximum of 32 million
segments or 64 Mb data, in practice not more than 8
million segments or 16 Mb data or 5.3 Mb per beam. 72 Mb
buffer memory in the beam processor units (24 Mb in each
unit) will then store several fields as shown in Figure 5.
An occasional field with too much data will cause the FIFO
buffer to fill up and the pipelining will be lost for a
couple of fields, but the system will recover gracefully.
With a larger number of processors than beams the writing
hardware need only wait for data transfers, not for
processing, since the subsequent fields are already in the
transfer buffers in the IRs.
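
The field and buffer figures above follow from the scan-line spacing and minimum segment length of 0.25 µm and two bytes per segment. The following sketch reproduces them; the function name `field_budget` is hypothetical and "Mb" is again read as megabytes.

```python
def field_budget(width_um, length_um, pitch_um, min_segment_um,
                 bytes_per_segment=2, n_beams=3):
    """Worst-case segment count and data volume (Mb) for one writing field."""
    max_segments = (width_um / pitch_um) * (length_um / min_segment_um)
    max_mb = max_segments * bytes_per_segment / 1e6
    return max_segments, max_mb, max_mb / n_beams

# Field of 200 µm x 10 mm, filled entirely with minimum-length segments.
max_segments, max_mb, max_mb_per_beam = field_budget(200, 10_000, 0.25, 0.25)

# Practical case stated in the text: 8 million segments per field.
typical_mb_per_beam = 8e6 * 2 / 1e6 / 3
# Number of such fields that fit in the 24 Mb buffer of one unit.
fields_per_unit = 24 / typical_mb_per_beam
```

The worst case gives 32 million segments and 64 Mb per field; at the practical 5.3 Mb per beam, each 24 Mb buffer holds about 4.5 fields, consistent with the several fields per FIFO shown in Figure 5.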

The size of the fields can be changed dynamically, so that
the field size is made smaller for extremely dense
patterns and larger for less dense patterns.

Even in the case where the data to the beam processors are
only rectangular non-overlapping segments, the conversion
from geometrical elements to time and power in the beam
processor uses a set of rules. First the geometry is
converted to the hardware-supported time and power
resolution. Secondly, the linearity between the power in
the beam and the position of the edge is only approximate.
When the beam is only slightly larger than the distance
between two scan lines, the transient function is s-shaped
and on some photo-sensitive materials there is an
additional sag. Therefore it is advantageous to make an
empirical calibration and store the calibration curve as a
lookup table. Furthermore, if the geometrical linearity of
the scan line is not perfect a stored geometrical
correction table is useful.
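
One way such a calibration lookup table could be built is sketched below. The specification does not prescribe a construction; the function name, the normalisation of power and edge position to the range 0 to 1, the linear interpolation, and the sample measurement values are all assumptions made for illustration.

```python
from bisect import bisect_right

def make_lookup(calibration, n_levels):
    """Invert a measured (power, edge_position) curve into a lookup table.

    calibration: list of (power, edge_position) pairs with edge positions
                 strictly increasing from 0 to 1 (assumed normalisation).
    Returns, for each of n_levels evenly spaced target edge positions, the
    power that produces it, by linear interpolation between measured points.
    """
    powers = [p for p, _ in calibration]
    edges = [e for _, e in calibration]
    table = []
    for level in range(n_levels):
        target = level / (n_levels - 1)
        # Index of the measured interval that contains the target position.
        i = bisect_right(edges, target)
        i = min(max(i, 1), len(edges) - 1)
        e0, e1 = edges[i - 1], edges[i]
        p0, p1 = powers[i - 1], powers[i]
        table.append(p0 + (p1 - p0) * (target - e0) / (e1 - e0))
    return table

# Hypothetical s-shaped measurement: (relative power, relative edge position).
measured = [(0.0, 0.0), (0.25, 0.1), (0.5, 0.5), (0.75, 0.9), (1.0, 1.0)]
lut = make_lookup(measured, n_levels=5)
```

The beam processor would then index the table with the desired edge position instead of assuming that power and edge position are proportional.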

The invention and embodiments satisfy the need for a
real-time data conversion system for a wide range of
applications, including the most demanding. In particular
there is no hard limit to the number of processors that
can be used in typical embodiments, since they use a
cross-switch network that is more easily extendible than
bus systems. Systems designed according to the invention
can also evolve with the rapidly increasing requirements
on capacity. Since it is suitable to be built with
standard processors, standard computer boards and software
in a portable high-level language, it can follow the
technical development which has given a tripling of speed
every two years in the past.

The terms and expressions which have been employed in the
foregoing specification are used therein as terms of
description and not of limitation, and there is no
intention in the use of such terms and expressions of
excluding equivalents of the features shown and described
or portions thereof, it being recognized that the scope of
the invention is defined and limited only by the claims
which follow.